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Our goal is to obtain a complete set of angular observables arising in a generic multi-body process. 
We show how this can be achieved without the need to carry out a likelihood fit of the angular 
distribution to the measured events. Instead, we apply the method of moments that relies both on the 
orthogonality of angular functions and the estimation of integrals by Monte Carlo techniques. The 
big advantage of this method is that the joint distribution of all observables can be easily extracted, 
even for very few events. The method of moments is shown to be robust against mismodeling of 
the angular distribution. Our main result is an explicit algorithm that accounts for systematic 
uncertainties from detector-resolution and acceptance effects. Finally, we present the necessary 
process-dependent formulae needed for direct application of the method to several rare decays of 
interest. 


I. INTRODUCTION 

Our initial motivation for studying what we wish to 
call the method of moments is the determination of 
angular observables in the rare FCNC-mediated decay 
B —> X* * * § (—>■ Ktt)£ + £^. However, the method we de¬ 
scribe in the following is general and applies to all decay 
or scattering processes that can be formulated in terms 
of an orthogonal basis of angular functions. We find a 
previous work [I] that advocates this method chiefly for 
the determination of angular observables in non-leptonic 
B decays but also mentions the applicability to semilep- 
tonic decays. Our aim is to improve upon this previous 
work by studying the uncertainties that are introduced 
by mismodeling of the angular distribution, and by work¬ 
ing out a recipe to determine and unfold detector effects. 
The latter is crucial for the application of the method 
to real data. We show that the method of moments has 
several major advantages over the usual approach based 
on likelihood fits: 

1. Likelihood fits have convergence problems for 
a small number of events, and can require 
reparametrizations and/or approximations for a 
successful fit to the signal PDF. As an example, 
see the LHCb analysis of the angular distribution 
in B —> K* fi + decays [2], 

The method of moments does not require any such 
reparametrizations or approximations. 
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2. Likelihood fits can be unstable in case the under¬ 
lying physical model is only partially known. This 
can lead to overestimating the number of physi¬ 
cal parameters, and consequently inhibits the con¬ 
vergence of the fits. As an example, this type 
of problem occured in toy studies of the decay 
B —i K*(—> Kn)i + £~ as reported in pjj]. It was 
subsequently solved when a missing symmetry re¬ 
lation between the angular observables was found 
and applied, thereby reducing the number of fit pa¬ 
rameters. 

In contrast, we will show that the method of mo¬ 
ments does not require information on the correla¬ 
tions between model parameters as an input. In¬ 
stead, it yields the correlations as an output, at the 
expense of somewhat larger uncertainties. 

3. Mismodeling the underlying physics model can re¬ 
sult in systematic bias in likelihood fits. 

We will show that the method of moments is insen¬ 
sitive to a certain type of mismodeling; i.e. intro¬ 
ducing a cutoff in a partial-wave expansion of the 
signal PDF. 

4. Using the method of moments, the joint probabil¬ 
ity distribution of the angular observables rapidly 
converges towards a multivariate Gaussian distri¬ 
bution. This allows an easy transfer of correlation 
information from the experiments to interested the¬ 
orists. 

We continue with basic definitions that pertain to an¬ 
gular observables, and our results in the subsequent sec¬ 
tions. Let denote the set of all angles, and let V de¬ 
note the set of all other non-angular kinematic variables 



2 


needed to fully specify the final state of the process un¬ 
der study. For example, v may include invariant masses 
or center-of-mass energies. We define an angular observ¬ 
able Si as a coefficient in the probability density function 
(PDF), P(z7, $), of the process by 

(i) 

i 

Here, the dependence on the decay angles •& has been 
explicitly factored out in terms of the angular functions 
{/;('$)}. We assume there exists a dual basis of functions 
{/* W} such that the orthonormality relations 


state and two leptons in the final state, we list the dual 
bases in a series of appendices [A] through [C] Note that a 
similar analysis was done in [T] for the decays B —y J/ibd) 
and B —y J/ipK*. 

In the remainder of this letter we discuss how to ob¬ 
tain an angular observable Si{V) in an experimental setup 
where each recorded event is (approximately) distributed 
according to P. We establish the statistical basics in sec¬ 
tion |TTJ Section |III| is dedicated to the impact of system¬ 
atic effects such as mismodeling the underlying physics 
or detector acceptance effects. Numerical studies for one 
uni-angular and one triple-angular distribution are pro¬ 
vided in section m 


f d$f i 0)f j 0)=5 ij (2) 

Jn 

hold with representing the full angular phase space 
relevant to the process. For particle decays, P is generally 
expressed in terms of the fully differential decay width, 


II. SAMPLE-BASED DETERMINATION 

The orthonormality relations eq. © imply that a single 
angular observable S) can be projected out of the full 
PDF P as 


P(z7,tf) 


i d 2 r 
rd?d# ’ 


( 3 ) 


where T is the total decay width. For a scattering pro¬ 
cess, one can similarly use 
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a dDdd ’ 


( 4 ) 


where the total cross section a is used for the nor¬ 
malization. Since the determination of the total decay 
width or total cross section can be quite difficult, we 
emphasize that different normalizations for P can be 
used. For instance, the total decay width (or cross 
section) of the process of interest can be replaced 
by the corresponding quantity of a control-channel 
process. This change of normalization is equivalent to 
a linear rescaling of the angular observables {£)}; thus 
ratios or similar suitable combinations of the angular 
observables are not affected by a change of normalization. 


Our method is an extension of the classical method of 
moments with orthogonal functions [U sec. 8.2]. The 
only difference is that conventionally the angular func¬ 
tions are assumed self-dual , /) = /). However, it suf¬ 
fices that the system of angular functions {./)('$)} can be 
transformed into an orthonormal basis. We find it conve¬ 
nient to work in the basis of Legendre polynomials that 
are not self-dual. Our approach covers the self-dual case, 
provided that one replace /) —y /) appropriately. Using 
the ansatz 


fi — > O'ij fj ) 


( 5 ) 


Si(*)= / d VP(vJ)fi0). (6) 

J n 

where {fi} denotes a dual basis of angular functions, 
and Q represents the entire angular phase space. In 
general, {fi} may differ from {/,}. This is the case for 
our selection of applications in appendices [X] through [C| 

It is sensible to refer to the angular observable S) as 
the fi-moment of the PDF P. We emphasize that a 
relation of type eq. © holds for any combination of a 
density written as in eq. ([l]) and an orthonormal basis 
of angular functions {fi}', i.e., there is no unique ba¬ 
sis of angular functions. For the proof we refer to ref. [T|. 


Integration over the non-angular variables yields 


{Si) = [dvSi{v)= fdv ( d'dP{v,'d)fi{'d) 

J J Un 


( 7 ) 


The remainder of this section describes the method of 
moments, in which we replace the analytical integration 
by Monte Carlo (MC) estimates. The central tenet of MC 
integration is the fact that the expectation value Ep[g\ of 
some function g{x) under the probability density P{x), 


Ep[g\ = J dxP(x)g{x), (8) 

can be approximated by the consistent and unbiased es¬ 
timator Ep[g] [U sec. 8.2] 


___ i JV 

E P \g]^E P [g)^-^2g{x^),x^ (9) 

n =1 


the dual basis needs to be worked out case by case 
through solving the linear system of equations © . For 
a selection of hadron decays with a b quark in the initial 


due to the strong law of large numbers for N —y oo, 
assuming that the variates x- K n = 1 are 

distributed as P. Throughout this letter we denote all 
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MC estimators with a wide hat. 


Application of eq. then yields 
_ N 

(S i )^(S i ) = -^2f i (x^). ( 10 ) 

n =1 


It is often of interest to obtain observables integrated over 
certain bins of V. We define 


( s i) S b= / ^S^v) 


= / dv 


la \_J £1 


d fipp, ■&)&(■&) 


= / dv 


dl?P(i7,0)/i(0)l(3 < v< b) 


U n 


( 11 ) 

( 12 ) 

5 

(13) 


where the argument of the indicator function l(a < v < 
b) is to be interpreted componentwise. Application of 
eq. © immediately yields 

_ i N 

{■ S i) S ,S = x E <v<b). (14) 


For notational simplicity, let us forget about the v in¬ 
tegration and consider only S%. In the limit TV —> oo, 
the central limit theorem (CLT) implies that the random 
vector 

§ = (So,...,£,...) (15) 

follows a multivariate Gaussian distribution A f(S, S) cen¬ 
tered on the true value S with the covariance esti¬ 
mated as 


Y, Z j = Cav[Si, 5j] 

->^- = C w[Si,Sj] 


TV- 1 




(16) 


n—1 


In our physics applications, the parameter space is 
compact and each _/) is bounded. Hence the requisites 
for the most basic version of the CLT to hold — finite 
mean and covariance of /) — are automatically satisfied. 
In our numerical analysis the sample covariance rapidly 
converges towards the true covariance matrix; see also 
section m 


Compared to the usual maximum-likelihood approach, 
we find for the method of moments: 

1. The angular observable Si can be determined inde¬ 
pendently of any other observable Sj. It is therefore 
much more robust to physics assumptions needed to 
define the full likelihood. In particular, this means 


one does not have to be specific regarding the form 
of new-physics contributions; in fact, one does not 
even need to be able to explicitly formulate the like¬ 
lihood at all. 

2. It is superior for a small number of samples TV. 
Likelihood fits tend to be numerically unstable if 
lots of parameters need to be estimated from sparse 
data. This is more severe if the mode of the likeli¬ 
hood is near the boundary of the physically allowed 
region [5]. For some of these decays of interest, 
there are only O (100) events recorded per bin. 

3. The estimate is unbiased for any TV. In contrast, 
the maximum-likelihood estimate has a bias of or¬ 
der 1/TV [6.. In practice, one should keep in mind 
the bias-variance trade-off: it is a well known phe¬ 
nomenon that removing the bias usually leads to 
an increase in variance of the sampling distribution 
of the estimator and vice versa (4j sec. 7.3]. From 
a Bayesian decision-theory point of view, both con¬ 
tribute similarly to the expected loss associated 
with deciding on just one value of the unknown 
parameter. One should therefore not prefer the 
method of moments over likelihood fits just because 
the former reduces the bias (7] sections 13.8,17.2]. 
In fact, for the results discussed below in section |TV| 
the likelihood fits — if they converge — exhibit a 
negligible bias and produce uncertainties 10%-30% 
smaller than those from the method of moments. 

4. The approximate multivariate Gaussian distribu¬ 

tion of S allows easier and more precise transfer 
of the information in the data to interested theo¬ 
rists for more accurate fits of standard-model and 
new-physics parameters [MED], or for more precise 
predictions of optimized observables; see e.g., [TR - 
im for definitions of such optimized observables in 
B —» K*l + t~ decays, [IS] for application to the 
decay B —► mr£~V(, and m for observables in 
Af, —► A (—> While the likelihood also 

approaches a multivariate Gaussian as TV — > oo, 
the two methods differ in their utility as input for 

theorists if S is not well inside the physical region. 
For example, suppose there are two angular observ¬ 
ables that are constrained to a triangular region by 
phase-space or unitarity arguments as 


|Si| + S 2 < 1, Si € [-1,1], s 2 e [0,1]. (17) 

It may (and often does) happen in practice that 

S is close or even outside the allowed region such 
that a significant part of the probability mass 
covers unphysical values. In a Bayesian fit, one 

would take Af(S\S, E) as the sampling distribution 
of the “data”, and simply set a uniform prior on the 
triangle in the S plane defined by eq. (17) to have 
a well defined problem. This could be trivially 
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FIG. 1. Decay topology for decays B —¥ P\Pii\ii- 


combined with other independent information in a 
global fit. Someone with a different physics model 
might have to consider a different physical region, 
and could incorporate it just as easily. 


For a likelihood fit, the constraint needs to be part 
of the analysis performed by the experimental col¬ 
laboration, and the resulting likelihood as a func¬ 
tion of S may be distinctly not Gaussian. Commu¬ 
nicating such a result has proved to be challenging 
due to technical reasons (such as data formats, size 
etc.) This leads to the undesirable situation that 
only the mode and standard errors are reported, 
and theorists often include the results as indepen¬ 
dent measurements with a Gaussian distribution 
and disregard the boundary problem as well as cor¬ 
relations altogether. 


III. SOURCES OF SYSTEMATIC 
UNCERTAINTIES 


In section |TTJ we assume that the PDF P describes the 
underlying physics accurately, and that the experiment 
observes each event with perfect accuracy. In order to 
estimate systematic uncertainties, we lift these assump¬ 
tions. 


A. Mismodeling due to Contributions by Higher 
Partial Waves 


decays B —> PxP-z^i as an exampl^] 

Within existing analyses, the PDFs of these decays are 
usually expressed in terms of one or a few partial waves 
of the dimeson system. However, the angular momentum 
of the dimeson system is unbounded from above, and 
gives rise to an infinite set of angular observables. 


For the selected class of decays, we can describe that 
problem as follows: The PDF P has a fixed dependence 
on the dilepton helicity angle $1 and the azimuthal angle 
t? 3 . (See also appendix [C| for details on the angular dis¬ 
tribution.) However, at the level of decay amplitudes the 
dimeson system can have an arbitrarily large total angu¬ 
lar momentum j; only its third component is restricted 
to j z = —1,0, +1. It is then convenient to compute the 
angular distribution explicitly in terms of di and $ 3 , but 
leave the angular observables dependent on the remain¬ 
ing helicity angle $2 of the dimeson: 


P(cosi?i, cos $ 2 ,^ 3 ) 

= ^5 fe (f7,cos'd2)/fe(cos'd 1 ,'d 3 ). (18) 

k 


We advocate here that is a sensible procedure to per¬ 
form an expansion in terms of Legendre polynomials 
with respect to the remaining angle t? 2 - Here, the 
angular-momentum indices l and m follow from the usual 
rules for addition of the angular momenta j and j of the 
partial-wave expansion of the underlying amplitude and 
its complex conjugate: j —j < l < j +j, and to = j z +j z - 
For the decay at hand, we consider the partial-wave 
expansion for the angular observables (see [ 22 ] and ap¬ 
pendix [d| 

Sk ^ = Ij ^' with r = / i( i7 )- i ^> ( 19 ) 

which reads 

Sk{v, cos'd2) 

OO 

= Y] -<S'fe,i( i7 M (|m|) (c osi? 2 )- (20) 

The normalization factor ni \ m \ is defined in eq. (|D3|). 
Within our example we have (cf. also appendix |C| and 

m) 


fo, k = 1 , 2,6 

|m| = < 1, fc = 4,5, 7,8 . (21) 

[ 2 , k = 3,9 


In several interesting processes we might only have 
an approximate result for P. In this section, we focus 
on one particular class of mismodeling of the signal 
PDF: the angular-momentum cutoff in partial-wave 
expansions. This mismodeling potentially affects a large 
number of decays and scattering processes. For the 
sake of clarity we take the interesting class of four-body 


1 This includes the rare b —> s mediated B decay B —y Kir Pi - , 
and the V u b suppressed decay B —> tyttP iif. For both examples 
the PDF P is known in the small-width approximation and when 
assuming a pure P- wave resonant final state. An extension to 
S — P interference has been studied for B — » Kirl + £~ [141 |20l . 
and for B —> irirK vp W- For a first study of S, P and D 
interference, see lU 
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The angular observables S/j - as defined in eq. (201 
have the merit of a well defined total angular momentum, 
and thus are physically distinguishable. As a conse¬ 
quence of the orthogonality of the Legendre polynomials, 
any mismodeling (or rather, lack of modelling) of higher 
partial-wave observables does not affect the method of 
moments as discussed in the previous section. That is 
to say, adding further (orthogonal) terms to the PDF 
only appends observables to S, but does not change 
the leading elements. The same applies to the covariance. 


Unfortunately, this benefit on the experimental side 
is accompanied with a theoretical draw back. Each 
observable Sk,j{v) consists of an infinite sum of bilinears 
of partial-wave amplitudes. It remains for theoretical 
analyses to estimate or calculate the impact of partial 
waves beyond the S and P wave contributions. (For 
B —> Kn£ + £~ a first study has been carried out where 
contributions up to the D wave are investigated ED- 

We wish to emphasize that detector acceptance effects 
systematically affect the expansion for any basis of angu¬ 
lar functions, including the one suggested in this section. 
Nevertheless, the expansion in terms of Legendre poly¬ 
nomials as suggested above provides means to cope with 
these effects, as we discuss in the following subsection. 


B. Recipe for Including Detector Effects 

Ascertaining a detector’s performance to detect signal 
events with accurate determination of the event’s angles 
is generally a difficult task. In the ideal case, one would 
have an explicit probabilistic model of the detector accep¬ 
tance and could thus write down the full forward model 
from which the measured events arise. In practice, that is 
not feasible, and one is forced to simplify the model. The 
standard approach is to generate the true particle events 
= {(£^ n \#["^)}, n = from a PDF as¬ 

sumed to describe the bare physical process, and to prop¬ 
agate those particles through a detailed simulation of the 
detector. The observable traces that the particles leave 
in the detector are fed into reconstruction algorithms re¬ 
sulting in the detector events {a^} = 
n = 1,..., A r f j with N d < N t . In general, the distribution 
of the detected events is 

Pd{xd) = J J dz t Pt{xt)E(x d \x t ) . (22) 

Here P t is the probability distribution of the true events, 
the normalization constant R is given by 


where the PDF P(a;d|a:t) models the resolution effects 
and the unnormalized density e{xt) is the detector accep¬ 
tance function. Perfect resolution corresponds to 

P(x d \x t ) = 5(x d - x t ). (25) 

In what follows, we propose a systematic method to 
unfold all effects of E{x d|x t ) through MC simulations 
and the method of moments, using that E{xd\xt) can be 
expanded — at least formally — in Legendre polymials. 
For illustration, we proceed with the explicit example of a 
uniangulai[^]PDF with x = cos'd. Let us define the PDF 
in terms of the Legendre polymials (i.e., fk{x) = Pk{x)) 
and angular observables S as 


P t (x t ) = P t (x t \S) = Y,Skfk(x t ), (26) 

k 

where k = 0,1 ,... denotes an angular-momentum-like 
index associated with the observables. Normalization 
of Pt is equivalent to choosing Sq = 1/2. Requiring 
Pt(x t) > 0V;Et implies \Sk\ < 1/2 for k > 0. More strin¬ 
gent relations between the Sk might hold, but are of no 
concern here. For later use, we define 


g(m) _ 


f 1 / 24 , 0 , rn = 0 

(1/2(4,oT4,m), 771 >0, 


and note that 


P^(x t ) = P(x t \S^) 


(27) 


(28) 


is a valid PDF. The dual basis of angular functions fol¬ 
lows then from the normalization and orthogonality of 
the Legendre polynomials, and one therefore has 


h{x) = 2k 1 fk(x ), 


r +1 


f-1 


dxf k (x)fi(x) = 6 k ,i ■ 


(29) 

(30) 


We now define the simulated raw moments Q^ as 

= JJ dx t dxd fi(xd)P < ' m \x t )E(xd\x t ), (31) 


which are instrumental to our recipe. Monte Carlo esti¬ 
mators of these moments can be constructed from specif¬ 
ically crafted detector events x^ n,m \ n = l,...,A^ m \ 
where 

x ( d’ m) ~ p e * 1 \x d) = j dx t P (rn) {xt)E{x d \xt). 

(32) 


R = JJ dx t dx d Pt(*t)P(a;d|a;t) • (23) 

The kernel E(xd\x t ) is usually decomposed as 

(24) 


2 The generalization of this section to multiangular PDFs is 
straighforward. It can be achieved by promoting x to a vector, 
promoting the Legendre polynomials to products of independent 
polynomials or spherical harmonics, and promoting the indices 

i, j, k , m to multi-indices. 


E{xd\xt) = £(a: t )P(a;d|a;t) 






6 


In words, for each m it is required to generate events 
from a toy physical distribution P'~ m \ for which So = 
S m = 1/2 and all other observables are set to zero. 
Next, propagate these events through a detector sim¬ 
ulation. The normalization R( m ' > is chosen such that 
/ Pe % \ x d)d£d = 1. We emphasize that R( m ' ) can be 
estimated as R= N ( ^ m> /N t , where N t corresponds 
to the number of simulated true events. The estimators 
then read 


4 “ N d 


N d 

£/,(A””) 


n 


(33) 


Linearity of the integral over x and convergence of the 
expansion of P^ in terms of Legendre polynomials en¬ 
sures that 


condition is tightly related to the determination of 
the branching ratio of the underlying decay, and we 
therefore suggest to carry out a combined analysis 
for the determination of the branching ratio and the 
extraction of the angular observables. Moreover, in 
the applications to B decays only ratios and similar 
//-independent combinations of the angular observables 
are of interest. 


We note that S as determined from eq. (37) does not 
fully correspond to S for arbitrary detector acceptance 
e. Assuming an expansion of e in terms of Legendre 
polynomials up to a given order L, we have to calculate 
the raw moments up to dim Q = dim S = dim S + L. The 
MC estimators of the corrected angular observables then 
take the following structure: 




(34) 


We call the matrix M 1 the unfolding matrix, which is 
specific to t he decay at hand. Given our definition of 
Si m} in eq. (271 it is easy to see that 


Mij — 


f 2 Q[ 0) 

\2 (qP - q[ 0) ) 


3=0, 

j/0, 


(35) 


and its MC estimator M can be obtained through the 
replacements —> Q- m ' ) . 


In order to finally extract the angular observables 
from data, we use the measured raw moments. Their 

MC estimator Q — based on the detected events x^ n \ 
n = 1,... ,N — reads 


^ = R^J2Mx^). (36) 

n 

We then obtain MC estimators of the angular observables 
via 


S = 


M~ 


Q. 


(37) 


Apparently we now face a circular dependence. On the 

one hand, the estimators Q and thus also S are propor¬ 
tional to R, the ratio of detected events over occuring 
events. On the other hand, R depends by construction 
(see eq. (231) on P t (x t ), and thus on the true value of 
the angular observables S. This dependence is broken 

by the fact that the MC estimators S need to fulfill the 
self-consistency condition 


S Q = l 


for a uniangular distribution. (38) 


(For the process-dependent conditions in the multiangu- 
lar case see eq. (B9l and eq. (C6l.) This self-consistency 


S — (So, , *Sdim S'—1; ^dim 3, ■ ■ • 5 ^dim Q— l) ■ (39) 


physical 


“superfluous” 


The method is consistent as long as the MC estimators 
for the “superfluous” observables are compatible with 
zero. |^] The value L depends on the setup of the 
particle detector under consideration, and remains to be 
determined just as in studies that carry out a likelihood 
fit. 

The accuracy of the unfolding process as outlined 
above critically depends on both the accuracy of the 
detector simulation, as well as the uncertainties induced 
by the MC estimates. For an experimental analysis, one 
would now turn to the determination of the distribution 
of S as a function of the detector setup. This would 

involve the determination of both M and Q for a 
number of detector configurations, and subsequent 
profiling or marginalization. While such considerations 
of any detector simulation are beyond the scope of this 
work, we can, however, comment on the MC-induced 
uncertainties. As usual, one needs to find a balance 
between compute time and accuracy. For a uniangular 
distribution and O (lO 6 ) MC samples, we find that the 
error on the mean of each matrix element is O (l0~ 4 ). 
This suggests that the so-induced systematic error can 
be driven below any statistical uncertainty. 

An alternative method to unfold detector effects is 
weighting the data on an event-by-event basis, with each 
weight corresponding to the inverse of the detection 
efficiency. 


3 This holds only for the physical model of uniangular decay distri¬ 
butions. If the physical model involves a partial-wave expansion 
with cutoff, these superfluous observables correspond to higher 
partial waves that have been suppressed in the physical model; 
see appendix [Cl for such a case. 
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Let us conclude this section by commenting that parts 
of the unfolding matrix are universal in a sense: They 
can be reused in analyses with a similar underlying decay. 
Therefore, computing resources spent on improving the 
accuracy of M are not wasted. 


IV. TOY STUDIES 

We now study the performance of the proposed 
method. In order to do so, we simulate individual events 
for two separate physical processes: one uni-angular, 
and one tri-angular decay distribution. We repeat the 
analysis for varying sample sizes, ranging from 50 to 500 
events. Our toy analyses are based on SM predictions 
for angular observables in the decays B —> K£ + £~ 
and B —> K*{—> Ki:)t + £~. In order to faithfully 
investigate the performance of the method of moments, 
we repeat our numerical studies for several bins in the 
kinematic range 1 GeV 2 < q 2 < 6GeV 2 , as well as 
15GeV 2 < q 2 < <?max- Here, the bin width is chosen 
either as 1 GeV 2 or 0.5 GeV 2 . This setup is meant to 
ensure that a wide spectrum of possible values for the 
angular observables is investigated. 

Our findings can be summarized as follows: 

• In all studied cases we observed not a single bias in 
the distribution of the pull of any observables Si, 

puU.M. (40) 

& i 

Here S, refers to the true (input) value for the angu¬ 
lar observables, Si refers to the mean of the pseudo 

^ —- 1/2 

measurement via the MC estimate, and <jj = 
refers to an estimator of the standard deviation of 
the pseudo measurement. All pull distributions ob¬ 
tained in our studies can be successfully fitted to a 
Gaussian distribution. Out of the large number of 
studied distributions, we only show the pull distri¬ 
butions for the observables Sr, — 27% and <SV — 2% 
obtained from SM-like B —> K*£ + £~ decays as 
representative examples. We generated 2 • 10 5 toy 
studies with 200 events per study in figure [2] 

• We study the MC estimate for the absolute uncer¬ 
tainty <Ji(N) with respect to the angular observable 
S, as a function of the number of simulated events 
N. As expected for a multivariate Gaussian dis¬ 
tribution, we find that the absolute uncertainty is 
well fitted by 


Vi(N) = 


CTj 

Vn 


(41) 


with ai(l) = O (l), regardless of the absolute size of 
Sk ■ The latter can best be shown for the example of 
uncertainties of two observables. Taking again S$ 


(~ 27%) and S 7 (~ 2%) for SM-like B -»• K*i+i~ 
decays, we show the absolute uncertainty in fig¬ 
ure [H We find that the the method of moments 
yields uncertainties on S that are roughly 10% - 
30% larger than those obtained from maximum- 
likelihood fits and for the same number of events. 
However, we wish to note that said fits only pro¬ 
duce a limited subset of the angular observables, 
and the statistical error of the fit is expected to in¬ 
crease with the number of fit parameters until their 
errors saturate the statistical errors of the method- 
of-moments estimators [ 231 sec. 8]. 

• We also compare the results as obtained by the 
method of moments with results obtained by a con¬ 
ventional likelihood fit. In particular, we study 
the correlation between the method-of-moment es¬ 
timators and the maximum-likelihood-fit estima¬ 
tors. We run 10 3 toy analyses, with 200 simu¬ 
lated events per analysis. We show the joint dis¬ 
tribution of the two estimators in figure [4] The 
two estimators are highly correlated. The distribu¬ 
tion of the difference of the estimators exhibits now 



FIG. 2. Pull distribution for the angular observables Sr (up¬ 
per) and SV (lower), extracted from 2 • 10 5 studies of 200 sim¬ 
ulated events each for the decay B —> K*£ + £~. The red curve 
represents a fit to a Gaussian distribution. 
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FIG. 3. Uncertainty a 5 of the angular observable S 5 ex¬ 
tracted from 2 ■ 10 5 studies of simulated events for the decay 
B —> I\*£ + £~. We show the uncertainty as a function of the 
number of simulated events N. T he red curve represents a fit 
to the function given in eq. (411. The error bars correspond 
to the 68 % spread of measured uncertainties in the toys. The 
plot for <77 is visually indistinguishable from the one shown 
here. 


bias, which is due to the large number of simulated 
events. Still, we find that statistical uncertainty on 
the difference of the two estimators is sizeable and 
can easily become half as large as the statistical 
uncertainty of either estimator. 

We emphasize that the above results have been ob¬ 
tained for a flat acceptance function. We also study the 
behavior of the unfolded angular observables. For sim¬ 
plicity we limit our study to the decay B —> K£ + £~, 
with its three angular observables So to S 2 - We express 
the acceptance function in terms of Legendre polynomials 
Pk{x), 



FIG. 4. The joint distribution of the two estimators S'f IoM and 
Sf 14 that arise from the method of moments and a maximum- 
likelihood fit, respectively. We show the estimators for both 
of our benchmark observables S 5 (upper) and S 7 (lower). 


V. SUMMARY 


7 4 

e(cosi9) = —po(cos'd) - —p 2 (cos ££). (42) 

15 15 

This acceptance function approximates the one used in a 
recent study of the angular observables in B —> Kl + i~ 
decays [23j. Focusing on effects of the unfolding process 
itself, we use the analytical expression for the unfolding 
matrix, which we compute from the raw moments as 


defined in eq. (311. Simulting 4000 toy analyses with up 


to 300 simulated events each, we find that the previous 
bullet points still hold; i.e., we do not find any bias, 
and the distribution is well described by a multivariate 
Gaussian distribution. The latter only holds as long as 
the number of events per experiment exceeds ~ 30. 


We have carried out a combined analytical and nu¬ 
merical study of the method of moments; a method for 
the extraction of angular observables from the angular 
distribution of a general multi-body process. We have 
studied the performance of the method of moments 
using pseudo data derived from the SM predictions for 
one uniangular decay (B —> K£ + £~) and one triangular 
decay (B —► K*£ + £~). From this, we find rapid conver¬ 
gence of the joint likelihood of the angular observables 
towards a multivariate Gaussian. We draw the conclu¬ 
sion that this method exhibits several benefits in the 
determination of angular observables when compared 
with a maximum-likelihood fit. 


All of our toy studies, as summarized above, show con¬ 
sistently that the joint distribution of the angular observ¬ 
ables converges rapidly towards a multivariate Gaussian 
distribution. We therefore propose to publish the results 

in the form of the physical components of S and E. 


First, we find no bias in the determinations of the 
angular observables even for a small number of events. 
However, due to fewer model assumptions, the uncer¬ 
tainty on the mean values increases by roughly 10%-30% 
compared to likelihood fits. 

Second, in the absence of detector effects, the method 
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of moments does not rely on model assumptions for the 
partial-wave composition of the PDF. This is explicitly 
shown for the case of higher partial waves in multibody 
final states. 

Third, we develop a systematic method for the de¬ 
termination of detector effects that lead to dilution 
and mixing of the angular observables. We present an 
algorithm to calculate the necessary unfolding matrix, 
which is computationally feasible only when using the 
method of moments. The algorithm also accounts for 
higher partial waves. 

Fourth, the joint distribution of the angular observ¬ 
ables resulting from the method of moments is well 
approximated by a multivariate Gaussian distribution 
even for small number of events N ~ 30 both for the 
ideal uniform acceptance and a realistic example. This 
facilitates the precise transfer of correlation information 
to subsequent theoretical analyses. 

Last but not least, the resulting distribution arises 
without the need for additional model constraints. Thus 
more observables can be inferred from the same data 
than in a likelihood fit. In addition, the results from 
the method of moments can be more easily averaged or 
combined; e.g., in global fits. 


In conclusion, we argue that the method of moments 
is a competitive alternative to maximum-likelihood fits 
if angular distributions are involved. We wish to raise 
the interesting prospect of extending this method to ap¬ 
plications that feature PDFs composed from non-angular 
orthogonal bases. 
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Appendix A: Application to B ->• K£+£ 


The PDF for the decay B —► K£ + £ has been cal¬ 
culated for the most complete basis of dimension-six 


s£ + £ operators. It reads [IS 


d 2 r 


9 a \ 1 U 1 

,C ° S ^ = dT/dq 2 dq 2 d cos 

a(g 2 ) , Kg 2 ) 

dr /dq 2 dr /dq 2 


+ diW™ 2< ' 1 

= 5> Pi (costfi), 


(Al) 


with the conventional observables a(q 2 ) through c(g 2 ), 
and dr /dq 2 = 2a + 2/3c. We conveniently use the Leg¬ 
endre polynomials Pi{x), * = 0,1,2, 


Po(x) = 1 , Pi(x) = x , p 2 {x) = *(3x 2 - 1), (A2) 

as our basis of angular functions. Our basis of angular 
observables then translates to the conventional basis as 

*-£• (A3) 

In this case, the dual basis is simply given by pi(x) = 
(2 i + l)/2 Pi(x) such that 

dz?ip i (cosr?i)P(g 2 ,cosi?i)sinz? 1 = Si(q 2 ). (A4) 



Appendix B: Application to A b -)> A(— > Ntt)£+£- 


The PDF for the decay — in the presence of Standard- 
Model operators and their chirality-flipped counter parts 
— reads [T^] 


P(q 2 , cos t?i, cos i? 2 , $ 3 ) 


(dr / dq 2 ) - 1 d 4 r 
dg 2 d cos d cos $2 d $ 3 

^2 Si fi (cos 1 , cos , $3 ), 


(Bl) 

where q 2 denotes the dilepton mass squared, $1 = $1 
and $2 = denote the helicity angles in the dilepton 
and Nn systems, respectively, and $3 = <j> denotes the 
azimuthal angle. The index i should be interpreted 
as a multi-index, i = (li,l 2 ,m), where 0 < l\ < 2 
and 0 < l 2 < 1 denote the total angular momentum 
in the dilepton and the Nir system, respectively, and 
— 1 < m < 1 is the third component of either of the 
angular momenta. 


Our choice of an orthonormal basis reads 


fh,l 2 ,m( C0S1?1,C0S $2,^3) 

j ( l 1 - M)!(*2 - M)! 

V ( ; i + M) ! ( ; 2 + M)! 
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x pj™' 1 (cos (cos tf 2 ) 

! cos(|to|$ 3 ) m > 0 
sin(|m|i? 3 ) m < 0 , (B2) 

1 m — 0 


and its dual is 


(2li + 1)(2Z 2 + 1) I (l i — to)! (^2 — to)! 
87r y (Zi + to)! (l 2 + to)! 

xp£(cos 0 i)p£(cosi? 2 ) 

( 2cos(|to|$ 3 ) m > 0 
2sin(|m|^ 3 ) m < 0 . 

1 to = 0 


(B3) 


The correspondence between out choice of angular ob¬ 
servables in the angular momentum basis, and the angu- 
/ii! 2 ,m(cosi?i,cosi? 2 ,$ 3 ) lar observables as defined in reference m reads 


and 


and 


87r<S'o,o,o = 1, 87 tSo,i,o = 


Ko rr + 2 K, 


2cc -r 4J\2ss 

dT /d q 2 


87r5l -°-° “ d!W 


87t5i, = 


87t5i i1i0 = 


8 nSi 


8nS 2 ,i-i = 

o„c 2(A'i cc — K\ ss ) Q c 
87r <- 5 2 ,o,o — -dr /dq 2 -’ ° 7r ‘- ,2 > 1 >° — 

87r52,i,+i = 


6A' 4s 
dr/dg 2 ’ 
3 K 2c 
dT/dq 2 ’ 
6A' 3s 

■ 1,+1 dr/dg 2 ’ 

2v / 3A' 4sc 


dr/dg 2 ’ 
2{K 2cc ~ K2 ss) 
dr /dq 2 
2v / 3A" 3sc 
dr /dq 2 ’ 


where the decay width is 


dr 

dg 2 


— 2Ai ss + Ai cc . 


The dual basis is chosen such that 


H-1 


r+1 


r 2-rr 


dcosi?i / dcos$ 2 / d$ 3 P(g,cosi?i,cos$ 2 ,$ 3 )/i(cosi?i,cos$ 2 ,$ 3 ) = S)(g ). 


'-1 


'-1 


(B4) 


(B5) 


(B6) 


(B7) 


(B8) 


For the purpose of unfolding acceptance effects as laid and where j is now also a multi-index representing j = 
down in section |IIIB| it is instrumental to know that (Zi,Z 2 ,to). 

^* 0 , 0,0 — B and that maxcosi?!,cosi92,0 ^ b 


The recipe’s generating PDFs are therefore Appendix C: Application to B-> Kn£ + l~ 

P(*|{^ } }), with 


aU) _ J_ (0,0,0) j ~ (0,0,0) 

Stt I<Jj,( 0 , 0 , 0 ) + j 7 ^ (0,0,0) 


(B9) 


The PDF for the decay B —> Ktt£ + £~ — up to and 
including P-wave contributions — has been calculated for 
the most general basis of dimension-six b —► s operators. 
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It reads, expressed in terms of the angular observables 

{jj mm 


decay width reads 

dr (3Jl c — J2c) + 2(3Ji s — J 2s ) 

d q 2 3 


(C2) 


P(q 2 , cosi?i, cos^ 2 ,^ 3 ) = 


(dr /dq 2 ) 1 d 4 r 


d q 2 d cos d 1 d cos i ? 2 dt ?3 

= ^2Si{q 2 )fi{ cos-di,cos-d 2 ,'d3), 

(Cl) 

where $1 = i is the dilepton helicity angle; $ 2 = 'Ok is 
the Kt: helicity angle; $3 = <fi is the azimuthal angle; and 
q 2 is the square of the dilepton mass. The (^-differential 


It is convenient to define the basis of angular functions 
and its dual in terms of associated Legendre polynomials 
p/ n (x). The index i should thus be interpreted as a multi¬ 
index , i = (Zi,? 2 ,m), where 0 < l± < 2 and 0 < Z 2 < 2 
denote the total angular momentum in the dilepton and 
the Ktt system, respectively, and —2 < m < 2 is the third 
component of either of the angular momenta. We use the 
sam e ba ses of angular functions as given in eq. (B 21 and 
eq. (B3) for the decay Af, —»• hJL + t~. 


In that case, the angular observables correspond to the usual choice of observables via 


and 


and 


8nSi 


0,0,0 


= 1. 


87rS'i 


0,1,0 — 


3Jii — J; 


2 i 


dr /dq 2 


87rS'o j2 ,o = 


6(Jic — Jis) — 2 (J 2c — J 2s ) 

3dr /dq 2 


8^51^0,0 = 


Jqc T 2 Jqs 
dr /dq 2 


8t rS 1 !,! _! 


6 J 7 J 

dr /dq 2 ’ 


87tS'i i i j o = 0, 


87f<Si,i,+i 


6 J 5i 

dr /dq 2 ’ 


87rS’i i 2 _1 = 

87r5i,2,0 = 

8ttS'i i2i+ i = 


4V3J 7 

dr /dq 2 ’ 

2(^6 C dgs) 

dr /dq 2 
dV3J 5 

dr /dq 2 ’ 


87762,0,0 — 


4 (J 2c + 2 J 2s ) 
3 dr /dq 2 


87762,1,-1 


4V3 J 8l 

dr /dq 2 ’ 


87 r<S' 2 ,i,o — 

87762 , 1 ,+! = 


4 J 2 i 

dr /dq 2 ’ 

dr /dq 2 ’ 


87762 . 2 , -2 — 

87762.2, -1 = 
87752 2 0 = 

87762.2, +1 = 

877 5 2 , 2 ,+2 = 


8J9 

dT/dg 2 ’ 

8 J 8 

dT/dg 2 ’ 

8 (J 2c — 5 2s ) 
3 dr /dq 2 
8J 4 

dr /dq 2 ' 

8J3 

dr/dg 2 ■ 


(C3) 


(C4) 


(C 5 ) 


As before, for the purpose of unfolding accep¬ 
tance effects as laid down in section EH it is 
instrumental to know that /o,o,o = 1 , and that 

max cos , 9 ljCOS ,j 2j< £ \fi lt i 2tTn \ < 1. The recipe’s generating 
PDFs are therefore P(:r|{5- J) }) with 

gU ) _ J_ f^,(o,0,0) 3 = ( 0 , 0 , 0 ) 

8 tt |<5i,(o,o,o) + Sij j 7 ^ (0,0,0) 


To conclude this section, we remind that the angular 
momentum l 2 , associated with the angle $ 2 , is not 
bounded from above. This has to be considered if partial 
waves beyond the P-wave are included in the analysis, 
see e.g. m- However, our choice of basis is well suited 
to these applications with 0 < l < 00 . Note, that the 
physical range of m is not affected by higher partial 
waves. 


where also j is now a multi-index representing j 
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Appendix D: On the Partial-Wave Expansion of 
Angular Observables 

For convenience, we use this appendix to collect there 
necessary formulae needed in the partial-wave expansion. 
Let us assume that the angular decomposition has been 
achieved for some PDF P for all angles except for one 


angle d. We now focus on just one of the resulting ob¬ 
servables and denote it by S(d). The dependence on the 
non-angular variables will be ignored in the following. 
Suppose S has an expansion in terms of partial waves 
h,h = 0,1, 2,... = S,P,D,... of the underlying ampli¬ 
tudes Ai and A 2 , 


S(d) = F(A 1 (d)AUd)] = F 


/ OO \ / OO 

E A 

\l 1=0 / V 2—0 


*(^ 2 ) (m 2 ) 


cosd) 


(Dl) 


where F £ {Re,Im} denotes taking either the real or 
the imaginary part. Here p\ m ' 1 denotes an associated 
Legendre polynomial and mi is the third component of 
the angular momentum of the amplitude Ai, and we 
impose mi > m 2 - 

From the orthogonality of the Legendre polynomials, 
one immediately finds 

p OO 

/ dcos$ \Ai{d)\ 2 = ^2 I A i i>) \ 2n ii,m i < S'incl, (D2) 
J li=0 

where Si nc i is the corresponding observable in the asso¬ 
ciated inclusive decays, and where we introduce ni tTn via 
the scalar product of two associated Legendre polynomi¬ 
als, 


/* 1 

n l,mh,v — / d COS dp^ (cos (cost?) 

2 (l + m)l 
~ (21 + 1) (l — m)\ 1 ' 1 ' ‘ 


(D3) 


The positivity of the amplitudes in eq. (D2| implies that 
we can estimate the error introduced by cutting off the 
expansion at some arbitrary angular momentum L. One 
obtains 


( D4 ) 

1=0 


Will will show in the following that such a cutoff is 
compatible with defining a basis of angular observables 
as coefficients of Legendre polynomials in cos'd. Since 
this expansion implies a well defined total angular 
momentum for each observable, one ensures that the 
observables can in fact be disentangled experimentally. 


We decompose S in terms of the associated Legendre 
polynomials (cos $), with total angular momentum j 
and its third component m = m\ + m 2 - 


= 51 Sj,mp\ m) (costf). 
j 


(D5) 


This parametrization has two merits. First, we can im¬ 
mediately pr oject out the angular observables <Sj >m by 
means of eq. (D31: 


. rn — 


1 

j ,m 



d cos d S('d)p < J m ' > (cos d) . 


(D6) 


(Here and in the next step we may exchange the integral 
and the series because each element of the series is a 
product of polynomials on the compact support [-1,1], 
and thus each integral is absolutely convergent). Second, 
we can immediately express S hrn in terms of the partial- 
wave amplitudes, 


r+i 




b j,m J — 1 


dcos dp ( j Tll+m2 ^ (cost?) E F Ai 1 ^pl™ 1 \cosd)A 2 ^pl™ 2 \cosd) 


1 1 ,h=0 


= £ f\a+A 


(h) A *(h) 
2 


T, 






1 1 , 1 2=0 


rid 


(D7) 


with m = m\ + m 2 ■ 
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In the last step, we use Gaunt’s formula 

r+i 


to integrate a triple product of associated Legendre polynomials, 


T, 


(m 1 ,m 2 ) 


h,h ,j 


J-l 

= (-l) 
xD-i) 

t=p 


d cos $p( mi+m2 ^ (cos d)p^ 1 2 3 4 5 6 7 8 9 10 11 12 ' > (cos (cos •&) 


S — l l —7712 


2(Zi + toi)!(/ 2 + to 2 )!(2s — 2Z 2 ) !s! 


(h - mi)!(s - j)!(s - Zi)!(s - Z 2 )!(2s + 1)! 

t (j + m + Z)!(Zi + h - m - t)\ 

t\(j - m — t)!(Zi - Z 2 + to + Z)!(Z 2 — to 2 - t)\ ’ 


(D8) 


where 


to = TOi + to 2 , TOi > to 2 , 

j,h,h>0, to, mi, m 2 > 0, 

and 

j + h + Z 2 

2 ’ 

max(0, Z 2 — Zi — to) , 
min(Zi + Z 2 — m,j — to, Z 2 - m 2 ). 


s = 

P = 
9 = 


r 


(D9) 


(DIO) 


The necessary conditions for T ^ 0 are 

s G N A Zi — ^2 ^ ^1 t ^2? • 


(Dll) 


The latter condition is well known from the addition rules 
of angular momenta. Note, however, that the sum in 
eq. (Dll goes to infinitely high angular momenta l\ and 
Z 2 . As a consequence of this and of eq. (Dill, the angu¬ 


lar observables Sj }Tn consist of sums with infinitely many 
terms. It is then up to theoretical analyses to estimate or 
calculate the impact of the neglected partial waves, e.g. 
as outlined above. 
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